Project Description
My mission this summer in the Information Security Office (IT-B) was to develop a program
which would separate the information in the network logs.





Purpose of the Project 
The network logs were getting longer and longer as advertisements made their way onto
the most visited web pages. The purpose of this project was to simplify the view of the
network logs (data reduction) so that it would be easier to analyze the information.
Research Techniques
Step 1: Take a look at Uniform Resource Locator (URL) logs.

Step 2: Learn about Perl and its characteristics.

Step 3: Research web page content (including advertisements) and how it works.

Step 4: Get familiar with Perl code.

Step 5: Determine if a link is actual content or an advertisement.

Step 6: Write a software program which separates the content from the ads.


#!/usr/bin/perl

print "Content-type: text/html\n\n";   # HTTP header

# Variables used without a declaration
$somenumber = 4;                                    # scalar holding a number
$myname = "some string";                            # scalar holding a string
@array = ("value00", "value01", "value02");         # array
%hash = ("Quarter", 25, "Dime", 10, "Nickel", 5);   # hash of key/value pairs

## OR ##

# The same variables declared with "my"
my $somenumber = 4;
my $myname = "some string";
my @array = ("value00", "value01", "value02");
my %hash = ("Quarter", 25, "Dime", 10, "Nickel", 5);
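
The declarations above only store values. As a minimal sketch that is not part of the original slide, the values can be read back as shown here. The variable names are reused from the slide; describe_coin is a hypothetical helper added only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @array = ("value00", "value01", "value02");
my %hash  = ("Quarter", 25, "Dime", 10, "Nickel", 5);

# A single array element is a scalar, so it is read with "$", not "@".
my $first = $array[0];
my $count = scalar @array;   # number of elements in the array

# Hypothetical helper: look up a coin's value in the hash by its name.
sub describe_coin {
    my ($name) = @_;
    return exists $hash{$name}
        ? "$name = $hash{$name} cents"
        : "$name is unknown";
}

print "first element: $first\n";
print "element count: $count\n";
print describe_coin("Dime"), "\n";
```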






Perl Script Example 
#! perl
#
# 07-Jul-2011 Carlimar Collazo
# Use Perl to read a URL log delimited with a tab
#
$tab = "\t";
$input_file = $ARGV[0];          # first command-line argument: the log file
$output_file = "output.txt";

open (OUT, "> $output_file") || die "Unable to open $output_file: $!\n";
open (IN, "$input_file") || die "Unable to open $input_file: $!\n";
#
while (<IN>) {
    chomp;                       # remove the trailing newline
    $curr_line = $_;
    # Tab-delimited fields: IP address, date, action, and the rest of the line
    ($ip, $date, $action, $w4) = split(/$tab/, $curr_line);
    # The URL is the first space-delimited word of the fourth field
    ($url, $remainder) = split(/ /, $w4);
    # Splitting "http://host/path" on "/" gives "http:", "", "host", ...
    ($d1, $d2, $domain) = split(/\//, $url);
#   print (OUT "$ip$tab$date$tab$action$tab$w4\n");
    print (OUT "ip = $ip\n");
    print (OUT "date = $date\n");
    print (OUT "action = $action\n");
    print (OUT "url = $url\n");
#   print (OUT "remainder = $remainder\n");
#   print (OUT "domain = $domain\n");
#   print (OUT "$domain\n");
    print (OUT "\n");
}

close(OUT);
close(IN);
Experience with Mentor and Co-workers
My experience with my mentor was awesome. He explained things in a way I could
understand. He had confidence in my abilities. This helped me and encouraged me to
work hard until I reached the expected goal. My co-workers were always willing to
help, explain, and answer questions.
Information and Technology Security (IT-B) 



IT and Comm Services 



Knowledge Gained
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

D:\Documents and Settings\ccollazl>cd Desktop

D:\Documents and Settings\ccollazl\Desktop>cd "Perl commands"

D:\Documents and Settings\ccollazl\Desktop\Perl commands>cd ARGV

D:\Documents and Settings\ccollazl\Desktop\Perl commands\ARGV>example_get_url_fields1.pl urls_201.txt

D:\Documents and Settings\ccollazl\Desktop\Perl commands\ARGV>example_get_url_fields1.pl url_128.217.135.57_20110707.txt

D:\Documents and Settings\ccollazl\Desktop\Perl commands\ARGV>
The purpose of this project is to analyze network monitoring logs in support of the computer
incident response team. Specifically, it aims to gain a clear understanding of the Uniform
Resource Locator (URL) and its structure, and to provide a way to break down a URL by
protocol, host name, domain name, path, and other attributes. Finally, it will provide a
method of data reduction that identifies the different types of advertisements shown on a
web page, for use in incident data analysis.
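
As an illustration of breaking a URL down into its attributes, here is a minimal Perl sketch. It is not the project's program: break_down_url is a hypothetical helper, www.example.com is a placeholder address, and the rule that the domain is the last two labels of the host name is a simplifying assumption that fails for suffixes such as .co.uk and for hosts given as IP addresses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a URL of the form protocol://host/path into
# its parts. Ports, user names, and query strings are not separated out.
sub break_down_url {
    my ($url) = @_;
    my ($protocol, $host, $path) = $url =~ m{^(\w+)://([^/]+)(/.*)?$}
        or return;                      # not a URL this sketch understands
    $path = "/" unless defined $path;   # no path given: use the root

    # Simplifying assumption: the domain is the last two labels of the host.
    my @labels = split /\./, $host;
    my $domain = @labels >= 2 ? join(".", @labels[-2, -1]) : $host;

    return (protocol => $protocol, host => $host,
            domain   => $domain,   path => $path);
}

my %parts = break_down_url("http://www.example.com/news/index.html");
print "$_ = $parts{$_}\n" for qw(protocol host domain path);
```

For the placeholder URL this prints http, www.example.com, example.com, and /news/index.html as the four parts.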

The procedure used for analysis and data reduction will be a computer program which
analyzes each URL and distinguishes advertisement links from the actual content
links.
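
One way such a program could tell the two kinds of link apart is to compare each URL's host name with a list of known advertisement domains. The sketch below assumes that approach; the domains in %ad_domains are invented placeholders, and classify_url is a hypothetical name, not a function from the project.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented placeholder list; a real list would come from research into
# known advertisement servers.
my %ad_domains = map { $_ => 1 } ("ads.example.net", "banners.example.org");

# Return "ad" if the URL's host is a listed domain or a subdomain of one,
# otherwise return "content".
sub classify_url {
    my ($url) = @_;
    # Splitting "http://host/path" on "/" gives "http:", "", "host", ...
    my (undef, undef, $host) = split(/\//, $url);
    return "content" unless defined $host;
    $host = lc $host;
    # Check the full host name, then each parent domain in turn.
    while (1) {
        return "ad" if $ad_domains{$host};
        last unless $host =~ s/^[^.]+\.//;   # drop the leftmost label
    }
    return "content";
}

for my $url ("http://ads.example.net/banner.gif",
             "http://www.example.com/index.html") {
    print classify_url($url), "\t$url\n";
}
```

Matching on parent domains means a host such as img.ads.example.net is also treated as an advertisement, at the cost of marking every subdomain of a listed domain.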

My method of data collection will be based on researching known advertisement
sites and understanding the way they are written. In addition, a large part of the data
collection will rely on statistical analysis of the actual log data in order to identify
additional advertisement sites.
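
A simple form of statistical analysis is to count how often each host appears in a log; hosts requested unusually often are candidates to review as possible advertisement sites. The sketch below assumes a tab-delimited layout of IP address, date, action, and URL. The sample lines are made up for illustration, and count_hosts is a hypothetical helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how many log lines refer to each host name.
sub count_hosts {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        chomp $line;
        # Assumed tab-delimited fields: ip, date, action, URL plus remainder
        my (undef, undef, undef, $w4) = split(/\t/, $line);
        next unless defined $w4;
        my ($url) = split(/ /, $w4);
        next unless defined $url;
        # Splitting "http://host/path" on "/" gives "http:", "", "host", ...
        my (undef, undef, $host) = split(/\//, $url);
        $count{lc $host}++ if defined $host && length $host;
    }
    return %count;
}

# Made-up sample lines, for illustration only.
my @sample = (
    "10.0.0.1\t07-Jul-2011\tGET\thttp://ads.example.net/a.gif",
    "10.0.0.1\t07-Jul-2011\tGET\thttp://www.example.com/index.html",
    "10.0.0.2\t07-Jul-2011\tGET\thttp://ads.example.net/b.gif",
);

my %count = count_hosts(@sample);
# Most frequently requested hosts first; ties are listed alphabetically.
for my $host (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
    print "$count{$host}\t$host\n";
}
```

A high count alone does not prove that a host serves advertisements, so the output is best treated as a list of hosts to check by hand.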

This project is going to be a big learning experience; the program is going to help the
Information Technology Security Office identify the ads that may clutter the actual
content data. This will make the analysis simpler because the program will recognize
the ads right away.

The information contributions are provided by the IT Security Office (IT-B). 